Abstract. We analyze whether there is any residual foreground contamination in the
cleaned WMAP 7 years data for the differential assemblies Q, V and W. We calculate
the correlation between the foreground map, from which long wavelength correlations
have been subtracted, and the foreground reduced map for each differential assembly
after applying the Galaxy and point sources masks. We find positive correlations for
all the differential assemblies, with high statistical significance. For Q and V, we
find that a large fraction of the contamination comes from pixels where the foreground
maps have positive values larger than three times the rms values. These findings imply
the presence of residual contamination from Galactic emissions and unresolved point
sources. We redo the analysis after masking the extended point sources catalogue of
Scodeller et al. [7] and find a drop in the correlation and corresponding significance
values. To quantify the effect of the residual contamination on the search for primordial
non-Gaussianity in the CMB we add the estimated contaminant fraction to simulated
Gaussian CMB maps and calculate the characteristic non-Gaussian deviation shapes
of Minkowski Functionals that arise due to the contamination. We find remarkable
agreement of these deviation shapes with those measured from WMAP data, which
implies that a major fraction of the observed non-Gaussian deviation comes from residual
foreground contamination. We also compute non-Gaussian deviations of Minkowski
Functionals after applying the point sources mask of Scodeller et al. and find a decrease
in the overall amplitudes of the deviations, which is consistent with a decrease in the
level of contamination.

1. Introduction

The cosmic microwave background (CMB) radiation and the large scale structures in
the universe carry a wealth of cosmological information. Observational data support the
cosmological models dominated by cold dark matter and the cosmological constant [1, 2]
(see also [3] for a critical review of the current cosmological models). In the case
of the CMB the correct extraction of cosmological information crucially depends on
our ability to measure the true CMB signal. In practice, the experimentally observed
CMB temperature fluctuations are composed of the true CMB signal and foreground
signals coming from astrophysical sources that emit photons in the frequency ranges
spanned by the observations. The major part of the foreground component comes from
diffuse emissions from our Galaxy, and a small fraction comes from extra-Galactic point
sources [4] such as radio galaxies and dusty star-forming galaxies. The Galactic emissions
consist of thermal and spinning dust emissions, free-free emission from electron-ion
scattering, synchrotron radiation from shock accelerated electrons interacting with the
Galactic magnetic field and a component called the 'haze' whose physical origin is not
yet understood. These foreground components are usually estimated based on templates
and then subtracted from the observed data [5].

The WMAP data release [6] includes masks for our Galaxy and for extra-Galactic
point sources which have been identified [5]. Henceforth, we refer to the point sources
mask provided by the WMAP team as PS1. Recently, Scodeller et al. [7] reported the
detection of new point sources in the WMAP data that have not been reported before.
They provide two extended masks [8], which we refer to as PS2 and PS3, and they
include the sources identified by the WMAP team as subsets. PS2 has 1116 sources
outside the KQ85 Galactic mask, which were detected either at $5\sigma$ directly in any of
the 5 WMAP channels or at $5\sigma$ in internal templates and at $3\sigma$ in any of the channels.
PS3 has 2102 sources outside the KQ85 Galactic mask, which were detected either at
$5\sigma$ directly in any of the 5 WMAP channels or at $5\sigma$ in internal templates.

The goal of this paper is twofold. The first goal is to investigate whether there
is small but statistically significant residual foreground contamination in the cleaned
and masked WMAP data. Our method is based on calculating correlations between the
foreground field, which has been processed so as to remove long wavelength correlations
of the Galactic emissions, and the cleaned CMB data. Our basic premise is that if
there is no residual contamination in the cleaned and masked data we should obtain
no correlation. However, we find statistically significant positive correlation for WMAP
7 years data for the Q, V and W differential assemblies (DAs) where we have applied
the KQ75 Galactic mask and PS1. We further find that a large fraction of the correlation
(as much as 30% for the Q channel) comes from regions where the foreground map has
large positive values, which indicates unresolved point sources. These results give a
clear indication that there is residual foreground contamination in the cleaned data.
A brief report of these results has been presented in [9]. We redo the above calculation
of correlation after applying PS2 and PS3. As is reasonable to expect, we find a decrease
in the value of the correlations and a corresponding decrease in the statistical
significance of those values, implying that these newly identified point sources make a
non-trivial contribution to the correlations.

Our second goal is to study the effect of the residual contamination on the
estimation of non-Gaussianity parameters by using Minkowski Functionals (MFs)
[10, 11, 12, 13]. To this end we add the estimated contaminant fraction to Gaussian CMB
simulations and calculate its effect on the MFs. A comparison between the
characteristic non-Gaussian deviation shapes of the MFs that result from the residual
contamination and the non-Gaussian deviation shapes of WMAP data using PS1 reveals
a remarkable similarity. From this we conclude that the non-Gaussian deviations
seen in MFs measured from WMAP data must come predominantly from the residual
foreground contamination. Further, in order to isolate the effect of the new point sources
contained in PS2 and PS3 we redo the calculation of non-Gaussian deviation of MFs
from WMAP data after masking them. We find that the first MF is very strongly
affected and the non-Gaussian deviation shape is completely modified. The effect on
the other two is milder, with the non-Gaussian deviation shapes more or less unaltered
and a decrease in the amplitude of the deviations. This can be attributed to the fact that
masking the new point sources leads to a decrease in the level of residual contamination.
Earlier studies of the effects of contamination on the CMB have mostly focused on point
sources [14, 15, 16, 17]. An investigation of the effect of point sources on MFs was done
in [11].

This paper is organized as follows. In section 2 we present calculations of the
correlations between the foreground and cleaned CMB maps and their statistical
significance after applying point sources masks PS1, PS2 and PS3. In section 3, we
compute MFs from Gaussian simulations to which a fraction of the foreground field
is added and compare the non-Gaussian deviations to the corresponding deviations
measured from WMAP 7 years data using PS1. We further study the effect of masking
the additional point sources in PS2 and PS3 on the MFs. We end with concluding
remarks in section 4.

2. Quantifying residual foreground contamination 

We begin with the expectation that any two random fluctuation fields that originate
from completely different physical processes will not have any correlation. Let $f$ and
$f'$ be two random fields that have zero mean values, defined on the surface of a two
dimensional sphere. Let their rms values be denoted by $\sigma_0$ and $\sigma'_0$, respectively. By
rescaling them as $\nu(i) = f(i)/\sigma_0$ and $\nu'(i) = f'(i)/\sigma'_0$, where $i$ denotes the pixel
number, we can define a correlation parameter, $r$, as

$r = \langle \nu(i)\, \nu'(i) \rangle$, (1)

where the bracket denotes the average over all pixels. We expect $r$ to be zero if the two
fields are uncorrelated and non-zero otherwise. In numerical calculations we will always
get a non-zero value of $r$ even for two fields that are known to be uncorrelated, due to
the finite number of pixels, and we need to further test statistically whether the value
is small enough to be considered as practically zero.
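
As an illustration of Eq. (1), the following minimal Python sketch computes the correlation
parameter for two pixelized fields. It is not the pipeline used for the results of this
paper: the function name `correlation_parameter`, the pixel count and the toy random
fields are assumptions made only for the example. It shows that two independent fields
give a small but non-zero $r$ because of the finite number of pixels.

```python
import numpy as np

def correlation_parameter(f, f_prime):
    """Return r = <nu(i) nu'(i)> for two pixelized fields (hypothetical helper)."""
    f = np.asarray(f, dtype=float)
    f_prime = np.asarray(f_prime, dtype=float)
    # Remove any residual mean so that both fields have zero mean.
    f = f - f.mean()
    f_prime = f_prime - f_prime.mean()
    # Rescale each field by its own rms value.
    nu = f / f.std()
    nu_prime = f_prime / f_prime.std()
    # Average the product over all pixels.
    return float(np.mean(nu * nu_prime))

rng = np.random.default_rng(0)
n_pix = 100_000  # hypothetical number of unmasked pixels
a = rng.standard_normal(n_pix)
b = rng.standard_normal(n_pix)

# Independent fields: r is small but not exactly zero.
r_uncorrelated = correlation_parameter(a, b)
# A field contaminated by 10% of b: r is close to 0.1 / sqrt(1.01).
r_correlated = correlation_parameter(a + 0.1 * b, b)
```

The scatter of $r$ for independent fields decreases roughly as the inverse square root of
the number of independent pixels, which is why a separate statistical test is needed.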

The observed WMAP data, $f^{\rm obs}$, is a sum of the true CMB signal and foreground
contamination. Let us call the foreground component that is estimated using
a combination of galaxy observation and theoretical modeling the 'apparent'
foreground field, denoted by $f^{\rm appfg}$, keeping in mind that there may be a small error
in its estimation. This field is then subtracted pixel by pixel from $f^{\rm obs}$ to leave behind
the 'cleaned' CMB signal, which we denote by $f^{\rm cleaned}$. By definition, $f^{\rm cleaned}$ has zero
mean. If $f^{\rm appfg}$ has been correctly estimated, then we expect it to have negligibly
small correlation with $f^{\rm cleaned}$, since they come from totally different physical processes.
However, if the estimation is on the right track but not fully correct, then we should
expect some residual contamination in the signal field. This should show up as non-zero
correlation between $f^{\rm cleaned}$ and $f^{\rm appfg}$.

2.1. Peak field 

Our analysis is done using the 7 years data from the eight differential assemblies (DAs)
of WMAP, namely, Q1, Q2, V1, V2, W1, W2, W3 and W4. For each DA the Galactic
foreground is obtained from

$f^{\rm appfg} = f^{\rm obs} - f^{\rm cleaned}$, (2)

where 'appfg' indicates that $f^{\rm appfg}$ is the 'apparent' foreground and a fraction of it may
be left behind in $f^{\rm cleaned}$. To examine the existence of the residual foreground in the
cleaned map we will compare the cleaned map with the foreground maps where the
large-scale variation of the Galactic emissions is removed. A foreground map with the
large-scale variation subtracted is called the peak field and is defined as

$f^{\rm peak} \equiv \left( f^{{\rm appfg},\theta_s} - f^{{\rm appfg},3\theta_s} \right) - \left\langle f^{{\rm appfg},\theta_s} - f^{{\rm appfg},3\theta_s} \right\rangle$, (3)

where $\theta_s$ and $3\theta_s$ are the FWHM values at which we have smoothed the field. By definition
$f^{\rm peak}$ has zero mean. The left panel of Fig. 1 shows the peak field for the Q1 channel. In
the right panel we have shown pixels (in white) of the same peak field which have values
above $3\sigma^{\rm peak}$.
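
The construction of the peak field can be sketched on a small flat, periodic patch
instead of the full sphere. The Python example below is only an illustration under that
flat-sky assumption: the helpers `gaussian_smooth` and `peak_field`, the grid size and the
toy foreground (one long-wavelength mode plus one compact source) are hypothetical and
are not taken from the analysis of this paper.

```python
import numpy as np

def gaussian_smooth(field, fwhm_pix):
    """Smooth a periodic 2D field with a Gaussian of the given FWHM (in pixels)."""
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))  # convert FWHM to Gaussian sigma
    ky = np.fft.fftfreq(field.shape[0]) * 2.0 * np.pi
    kx = np.fft.fftfreq(field.shape[1]) * 2.0 * np.pi
    k2 = ky[:, None] ** 2 + kx[None, :] ** 2
    kernel = np.exp(-0.5 * sigma ** 2 * k2)  # Fourier transform of the Gaussian
    return np.real(np.fft.ifft2(np.fft.fft2(field) * kernel))

def peak_field(foreground, theta_s_pix):
    """Map smoothed at theta_s minus map smoothed at 3*theta_s, with the mean removed."""
    diff = (gaussian_smooth(foreground, theta_s_pix)
            - gaussian_smooth(foreground, 3.0 * theta_s_pix))
    return diff - diff.mean()

n = 128
y, x = np.mgrid[0:n, 0:n]
large_scale = 10.0 * np.cos(2.0 * np.pi * x / n)  # smooth, long-wavelength emission
source = np.zeros((n, n))
source[40, 70] = 100.0                            # one compact, point-like source

f_peak = peak_field(large_scale + source, theta_s_pix=4.0)
bright = f_peak > 3.0 * f_peak.std()              # pixels above 3 sigma^peak
```

In this toy case the long-wavelength mode is almost entirely cancelled by the
subtraction, while the compact source survives and is picked out by the
$3\sigma^{\rm peak}$ threshold.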

To reduce the boundary effects by controlling the degree of masking we use the
foreground mask map smoothed over FWHM $= 3\theta_s$. The pixels of this map have value
one well inside the mask boundaries and zero well outside, but have values between zero
and one near the boundaries. The distance from the original mask boundaries is then
encoded in the pixel values of the smoothed mask map. By using some threshold value,
$s_{\rm mask}$, of the smoothed mask pixels, we can control how far we stay away from
the mask boundary. Staying $2\sigma$ away from the boundary corresponds to choosing pixels
with $s_{\rm mask} > 0.89$. As we choose larger values of $s_{\rm mask}$ we stay further away from the
boundaries and the sky fraction decreases.

Figure 1. Left panel: Peak field for the Q1 DA. Right panel: Locations of pixels where
the same peak field has values above $3\sigma^{\rm peak}$ are shown in white.

Table 1. $r_c$ values for the eight DAs are shown. The point sources mask used is PS1.
For each DA and $s_{\rm mask}$, the upper value gives $r_c$ calculated using all unmasked
pixels, while the lower value has been calculated after also excluding pixels having
$\nu^{\rm peak} > 3$. The sky fractions for the three $s_{\rm mask}$ values from top to bottom
are roughly 62%, 60% and 58%.



2.2. Correlation between peak and cleaned CMB fields 
Let us denote

$\nu^{\rm cleaned}(i) \equiv f^{\rm cleaned}(i)/\sigma^{\rm cleaned}$, $\nu^{\rm peak}(i) \equiv f^{\rm peak}(i)/\sigma^{\rm peak}$, (4)

and define

$r_c \equiv \langle \nu^{\rm cleaned}\, \nu^{\rm peak} \rangle_{\theta_s}$, (5)

where the suffix $\theta_s$ is to remind us that we do the calculation for a choice of FWHM at
which $f^{\rm cleaned}$ has also been smoothed. $\sigma^{\rm cleaned}$ and $\sigma^{\rm peak}$ are of orders of $10^{-\ldots}$ and
$10^{-\ldots}$, respectively. Table 1 summarizes the main results for $r_c$ where point sources
mask PS1 has been used. We have chosen $\theta_s = 35'$ based on the resolution of the Q1
channel. Two values of $r_c$ are shown for each DA and $s_{\rm mask}$. The upper value is the case
where $r_c$ is calculated using all unmasked pixels, while the lower value is the case where
pixels with $\nu^{\rm peak} > 3$, shown in the right panel of Fig. 1 for Q1, have been excluded.
The first observation we make is that all $r_c$ values are positive. For Q channels we get
considerably larger correlation when we keep all unmasked pixels, larger by about 30%.
This indicates that there is non-trivial correlation arising from the pixels with $\nu^{\rm peak} > 3$.
For V channels the difference is about 20% while W channels do not seem to be affected.
The sky fractions for the three $s_{\rm mask}$ values are roughly 62%, 60% and 58%, respectively.
For Q and V channels, as we stay further away from the mask boundaries there is a small
but systematic decrease of $r_c$.

Table 2. Left: Number of maps, $N$, having $r_g > r_c$ for individual DAs, out of
1000 Gaussian maps for PS1. As in Table 1, upper values are for all unmasked
pixels included, while lower values are for the case when pixels with $\nu^{\rm peak} > 3$
have also been excluded. Right: Number of Gaussian maps, $N_0$, having $r_g > r_c$
simultaneously for all DAs.

2.3. Statistical significance of rc values 

We investigate how likely it is to get the observed $r_c$ values given in Table 1 by
comparing with correlations between the peak field and Gaussian CMB simulations.
For this purpose we simulate 1000 Gaussian CMB maps with WMAP 7 years parameter
values, add pixel window effect, beam smearing and WMAP 7 years noise characteristics.
Next we smooth by FWHM $35'$ and mask in exactly the same way as we did when
calculating $r_c$ and calculate the correlation with the peak field. We denote the correlation
value by $r_g$. The Gaussian fields are uncorrelated with the peak field and we should
get small values of $r_g$. This exercise will tell us what the typical value of a 'small' $r_c$
is that we can approximate to be zero for the number of pixels under consideration, and
how likely our observed $r_c$ values are to occur by random fluctuation and not due to a
true correlation.
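
The counting procedure described above can be sketched as follows. This is a toy
illustration only: the function `count_exceedances` is a hypothetical helper, and
uncorrelated white noise is used as a stand-in for the simulated Gaussian CMB maps,
whereas the actual simulations include the CMB power spectrum, beam, pixel window,
noise, smoothing and masking.

```python
import numpy as np

def count_exceedances(peak, r_observed, n_sims=1000, seed=0):
    """Count simulations whose correlation r_g with `peak` exceeds r_observed."""
    rng = np.random.default_rng(seed)
    nu_peak = (peak - peak.mean()) / peak.std()
    count = 0
    for _ in range(n_sims):
        # Stand-in for one simulated Gaussian CMB map (white noise, an assumption).
        sim = rng.standard_normal(peak.shape)
        nu_sim = (sim - sim.mean()) / sim.std()
        r_g = np.mean(nu_sim * nu_peak)
        if r_g > r_observed:
            count += 1
    return count

rng = np.random.default_rng(1)
peak = rng.standard_normal(10_000)  # toy peak field with 10^4 pixels

# About half of the simulations exceed r = 0 by chance.
n_zero = count_exceedances(peak, r_observed=0.0)
# r = 0.05 is about five times the expected scatter (0.01) for 10^4 pixels.
n_large = count_exceedances(peak, r_observed=0.05)
```

A small count for the observed correlation indicates that it is unlikely to arise from
chance alignment between an uncontaminated Gaussian map and the peak field.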

We count, out of the thousand $r_g$ values, how many are greater than $r_c$. The results
are shown in Table 2. The left table shows the number, $N$, of Gaussian maps having
$r_g > r_c$ for each individual DA, for the three $s_{\rm mask}$ values used earlier for calculating
$r_c$ and including/excluding the pixels having $\nu^{\rm peak} > 3$. When all unmasked pixels are
included, we get $N$ = for all $s_{\rm mask}$ values for Q channel, for V channels $N$ lies between
5 and 14, while for W channels $N$ lies between 120 and 294. These numbers imply
that the $r_c$ values for Q and V are statistically significant, whereas the values for W
channels have much lower significance. When pixels with $\nu^{\rm peak} > 3$ are also excluded,
we find a reduction of for all the channels. The table on the right side of Table 2
shows the number, $N_0$, of Gaussian maps having $r_g > r_c$ simultaneously for all DAs.
These values are again significant. Therefore, we conclude that the cleaned WMAP
data, particularly Q and V channels, contain a small but statistically significant amount
of residual foreground contamination.

Table 3. $r_c$ values for the eight DAs after applying PS2. As in Table 1,
for each DA and $s_{\rm mask}$, the upper value gives $r_c$ calculated using all unmasked
pixels, while the lower value has been calculated after also excluding pixels having
$\nu^{\rm peak} > 3$. The sky fractions for the three $s_{\rm mask}$ values from top to bottom
are roughly 61%, 59% and 57%.

Table 4. Left: Number of maps, $N$, having $r_g > r_c$ for individual DAs, out of
1000 Gaussian maps, calculated after applying PS2. As in Table 2, upper
values are for all unmasked pixels included, while lower values are for the
case when pixels with $\nu^{\rm peak} > 3$ have also been excluded. Right: Number
of Gaussian maps, $N_0$, having $r_g > r_c$ simultaneously for all DAs.

2.4. Correlation between peak and cleaned CMB fields after applying extended point
sources masks PS2 and PS3

We have repeated the calculation of $r_c$ and the analysis of the statistical significance
after applying PS2. Masking the new point sources results in a further
decrease of roughly 1% of the sky fraction. The $r_c$ values that we obtain are shown in
Table 3. The significance test results are shown in Table 4. We find a clear decrease
in the correlation values and their statistical significance, which indicates that there is
a reduction in the residual contamination, as should be expected.

3. Minkowski Functionals and residual foreground

The morphological properties of excursion sets of the CMB (the set of all pixels having
temperature fluctuation values greater than or equal to some threshold value, $\nu$) can be
neatly captured by the so called Minkowski Functionals. There are three MFs that are
relevant for the CMB. The first is the area fraction, $V_0$, of the excursion set, the second
is the total length, $V_1$, of iso-temperature contours or boundaries of the excursion sets
and the third is the genus, $V_2$, which is the difference between the numbers of hot and
cold spots [12]. For a Gaussian random field the MFs are given by,

$V_k(\nu) = A_k\, H_{k-1}(\nu)\, e^{-\nu^2/2}$, $k = 0, 1, 2$. (6)

$H_n(\nu)$ is the $n$-th Hermite polynomial and the amplitude $A_k$ depends only on the angular
power spectrum $C_\ell$. It is given by

$A_k = \frac{1}{(2\pi)^{(k+1)/2}} \frac{\omega_2}{\omega_{2-k}\,\omega_k} \left( \frac{\sigma_1}{\sqrt{2}\,\sigma_0} \right)^k$, (7)

$\sigma_j^2 \equiv \frac{1}{4\pi} \sum_\ell (2\ell+1) \left[ \ell(\ell+1) \right]^j C_\ell W_\ell^2$, (8)

with $\omega_k \equiv \pi^{k/2}/\Gamma(k/2+1)$. $\sigma_1$ is the rms of the gradient of the field and $W_\ell$ represents
the smoothing kernel determined by the pixel and beam window functions and any
additional smoothing. The presence of any small deviation from Gaussianity will appear
as deviations from these formulas. The MFs are useful because they have characteristic
non-Gaussian deviation shapes for different types of non-Gaussianity and can distinguish
them. They carry information of all orders of n-point functions and this makes them
unbiased towards specific forms of non-Gaussianity.
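
For concreteness, the Gaussian expectation of Eq. (6) can be evaluated with a few lines
of Python. The sketch below assumes the commonly used normalization
$A_k = (2\pi)^{-(k+1)/2}\, [\omega_2/(\omega_{2-k}\omega_k)]\, (\sigma_1/\sqrt{2}\sigma_0)^k$ and
$H_{-1}(\nu) = \sqrt{\pi/2}\, e^{\nu^2/2}\, {\rm erfc}(\nu/\sqrt{2})$; these conventions, as well as the
function names `omega`, `hermite` and `gaussian_mf`, are assumptions of this example.

```python
import math
import numpy as np

def omega(k):
    """Volume of the unit ball in k dimensions: pi^(k/2) / Gamma(k/2 + 1)."""
    return math.pi ** (k / 2.0) / math.gamma(k / 2.0 + 1.0)

def hermite(n, nu):
    """H_n(nu) for n = -1, 0, 1, the only orders needed for k = 0, 1, 2."""
    nu = np.asarray(nu, dtype=float)
    if n == -1:
        # H_{-1}(nu) = sqrt(pi/2) * exp(nu^2/2) * erfc(nu/sqrt(2))
        erfc = np.vectorize(math.erfc)
        return np.sqrt(np.pi / 2.0) * np.exp(nu ** 2 / 2.0) * erfc(nu / np.sqrt(2.0))
    if n == 0:
        return np.ones_like(nu)
    if n == 1:
        return nu
    raise ValueError("only n = -1, 0, 1 are supported")

def gaussian_mf(k, nu, sigma0, sigma1):
    """Gaussian expectation V_k(nu) = A_k H_{k-1}(nu) exp(-nu^2/2)."""
    nu = np.asarray(nu, dtype=float)
    amp = omega(2) / (omega(2 - k) * omega(k)) / (2.0 * math.pi) ** ((k + 1) / 2.0)
    amp *= (sigma1 / (math.sqrt(2.0) * sigma0)) ** k
    return amp * hermite(k - 1, nu) * np.exp(-nu ** 2 / 2.0)

nu_grid = np.linspace(-4.0, 4.0, 33)  # threshold values in units of sigma0
v0, v1, v2 = (gaussian_mf(k, nu_grid, sigma0=1.0, sigma1=1.0) for k in range(3))
```

With these conventions $V_0(\nu) = \frac{1}{2}{\rm erfc}(\nu/\sqrt{2})$, so that half of the area lies
above $\nu = 0$, while the genus term $V_2$ is odd in $\nu$ and vanishes at $\nu = 0$.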

For the numerical computation of MFs for any given random field we use the method
described in [19]. This method was shown to have numerical inaccuracies which are of
specific forms arising from the finite approximation of the delta function and which
scale as the square of the finite binning of the temperature threshold values at leading
order [20]. In our calculations we estimate and subtract these inaccuracies and we
denote the corrected result by $V_k^{\rm NG}$. For weakly non-Gaussian fields we can obtain the
Gaussian component by using the formula Eq. (6) where the amplitude is computed by
measuring $\sigma_0$ and $\sigma_1$ directly from the field. We denote it by $V_k^{\rm G}$. The non-Gaussian
deviation is then given by

$\Delta V_k = V_k^{\rm NG} - V_k^{\rm G}$. (9)

3.1. Effect of residual contamination on Minkowski Functionals

Figure 2. Minkowski functionals for the peak fields.

In this subsection we study how the residual foreground contamination affects the
MFs. We begin by examining the shapes of the MFs for the peak fields shown in Fig. 2
for each of the DAs. They show strong departures from Gaussian shapes.
We can therefore expect that if any small fraction of the peak fields contaminates the
CMB field it will show up as non-Gaussian deviation in the MFs.

In order to mimic and quantify the effect of the residual contamination on the MFs
we add $\epsilon f^{\rm peak}$ to Gaussian simulated maps, as,

$f^{\rm contaminated} = f^{\rm G} + \epsilon f^{\rm peak}$, (10)

where $f^{\rm G}$ is the simulated Gaussian map, to which we have added instrumental effects,
as described in section 2.3. Note that the largest contribution to the non-Gaussian
deviation of the MFs arising from non-zero $\epsilon$ will scale linearly with it [21]. The MFs
and their non-Gaussian deviations (yellow dots) computed from $f^{\rm contaminated}$, averaged
over 1000 maps, are shown in Fig. 3. We have chosen values of $\epsilon$ which result in
amplitudes of the MFs similar to the observed ones. The $\epsilon$ value used for these plots
is $\epsilon = 8\, r_c/(r_c + \sigma^{\rm peak}/\sigma^{\rm cleaned})$ for each respective DA. In the same figure we have also
shown the $\Delta V_k$ computed from the WMAP 7 years data (red dots) after applying PS1.
As seen in the figure, there is remarkable agreement between the two plots. We infer that
most of the non-Gaussian deviation that we measure in the WMAP data is contributed
by residual foreground contamination.
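
To make the connection between the contamination fraction of Eq. (10) and the
correlation parameter explicit, the toy Python sketch below adds a fraction $\epsilon$ of a
skewed, zero-mean field to an independent Gaussian field and measures their
correlation. For independent fields one expects
$r = \epsilon\sigma^{\rm peak}/\sqrt{\sigma_{\rm G}^2 + \epsilon^2 (\sigma^{\rm peak})^2}$. The fields, the value of
$\epsilon$ and the helper `corr` are hypothetical; this is not the prescription used to choose
$\epsilon$ for Fig. 3.

```python
import numpy as np

def corr(a, b):
    """Pixel-averaged product of the two fields after mean removal and rms scaling."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

rng = np.random.default_rng(2)
n_pix = 200_000
f_gauss = rng.standard_normal(n_pix)        # toy Gaussian map with unit rms
f_peak = rng.exponential(1.0, n_pix) - 1.0  # toy skewed, zero-mean "peak" field
eps = 0.05                                  # hypothetical contamination fraction

# Contaminated map: Gaussian map plus a small fraction of the peak field.
f_cont = f_gauss + eps * f_peak

r_measured = corr(f_cont, f_peak)
r_expected = eps * f_peak.std() / np.sqrt(f_gauss.std() ** 2 + (eps * f_peak.std()) ** 2)
```

In this simple model a measured correlation can be inverted to give a rough estimate of
$\epsilon$, up to the scatter caused by the finite number of pixels.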



3.2. Effect of PS2 and PS3 on Minkowski Functionals from WMAP 

We calculate the MFs for WMAP 7 years data after applying PS2 and PS3. The
non-Gaussian deviations are shown in Fig. 4, along with the result of PS1 so as to compare
the three. The Galaxy mask applied is KQ75 as done in previous sections. We find that
$\Delta V_0(\nu)$ is strongly affected by the removal of the new point sources: it flips sign. This is
simple to understand, as explained below. Let $N$ denote the total number of unmasked
pixels. At any $\nu$ let us denote the number of pixels with values greater than or equal
to $\nu$ by $n(\nu)$. $V_0(\nu)$ is given by $n(\nu)/N$. When we mask new point sources we exclude,
say $m$, positive valued pixels. Then the effect of the new masking gives a new value
$V'_0(\nu)$. Since $m$ is positive, this implies $V'_0(\nu) < V_0(\nu)$. If we started with $V_0$ which is
greater than the Gaussian value and this decrease makes $V_0$ less than the Gaussian value,
then $\Delta V_0$ will flip sign, which is the case here. This suggests that $V_0(\nu)$ is unreliable for
extraction of non-Gaussianity information due to our imprecise knowledge of point sources
in the sky. For $\Delta V_1(\nu)$ the amplitude is decreased considerably but the shape is more or
less unaffected. The genus is affected the least and the main effect is a reduction of the
non-Gaussian deviation around $\nu = 0$. These effects are due to the reduction in the
level of contamination due to the masking of the new point sources.

Figure 3. Minkowski Functionals and their non-Gaussian deviations (Eq. (9)),
measured from Gaussian simulations to which residual contaminant fraction has been
added, given by Eq. (10). Average over 1000 simulations.

Figure 4. Non-Gaussian deviations of Minkowski functionals for the eight DAs of
WMAP 7 years data for the three cases where point sources masks PS1, PS2 and PS3
were applied.
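
The counting argument for the area fraction given in this subsection can be checked
with a toy example. In the Python sketch below, which is only an illustration, a
Gaussian field receives a small number of strongly positive "point source" pixels, and
$V_0$ is computed before and after masking them. For thresholds below the source
values, masking $m$ such pixels changes $n/N$ to $(n-m)/(N-m)$, which is smaller because
$n < N$. The helper `area_fraction`, the field size and the source amplitude are
hypothetical, and the threshold is applied to the raw field values without
renormalizing the rms after masking.

```python
import numpy as np

def area_fraction(field, keep, nu):
    """V_0(nu): fraction of kept (unmasked) pixels with value >= nu."""
    values = field[keep]
    return float(np.mean(values >= nu))

rng = np.random.default_rng(3)
field = rng.standard_normal(50_000)
field[:200] += 4.0  # toy unresolved point sources: 200 strongly positive pixels

keep_old = np.ones(field.size, dtype=bool)  # original mask keeps every pixel
keep_new = keep_old.copy()
keep_new[:200] = False                      # extended mask removes the source pixels

v0_old = area_fraction(field, keep_old, nu=1.0)
v0_new = area_fraction(field, keep_new, nu=1.0)
```

Here `v0_new` is smaller than `v0_old`, so a small positive excess over the Gaussian
value can turn into a deficit once the source pixels are removed.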

4. Conclusion 

We have analysed the cleaned WMAP 7 years data with the goal of quantifying the
amount of residual foreground contamination outside the Galactic and point sources
masks and the resulting bias in the estimates of primordial non-Gaussianity by using
Minkowski Functionals. The presence of significant residual contamination is confirmed
by calculations of correlations between the cleaned maps and the foreground maps which
give values that are found to be statistically significant. The Q channel is found to have
the strongest correlations and hence the largest residual contamination, while the W
channel has the least. For Q and V channels we found that a large fraction of the
contamination comes from pixels where the foreground fields have large values. A comparison
of the correlation and significance values obtained after applying the point sources mask
provided by WMAP and the extended masks of Scodeller et al. reveals that the extended
masks remove some fraction of the residual contamination, as should be expected.

The above results have important implications for the extraction of cosmological
parameters from observational data, particularly on the search for primordial
non-Gaussianity, using the cleaned WMAP data. In order to understand the implications
we simulate contaminated CMB maps by adding a fraction of the foreground field to
Gaussian maps and measure Minkowski Functionals from them. The non-Gaussian
deviation shapes of all the three MFs are found to have remarkable agreement with
what is measured from the cleaned WMAP data after applying PS1. Non-Gaussian
deviations of MFs calculated after applying PS2 and PS3 give a reduction in the overall
magnitude of the deviations compared to PS1. $\Delta V_0$ actually changes sign owing to
its strong sensitivity to point sources and is not reliable to be used for constraining
primordial non-Gaussianity. The shapes of $\Delta V_1$ and $\Delta V_2$ are relatively insensitive to
the different masking, with the main effect being a decrease in the amplitude. These
results are consistent with a reduction in the level of residual contamination.

We conclude that the cleaned WMAP 7 years data contains a significant amount of
residual foreground contamination, both from diffuse Galactic emissions and unresolved
extra-Galactic point sources. Note that more than 15000 point sources have already
been identified from data from the PLANCK satellite [23]. A rough visual comparison
between the amplitudes of $\Delta V_k$ in Figs. 3 and 4 and the corresponding amplitudes
of deviations due to the local primordial non-Gaussianity parameter $f_{\rm NL}$ [21] tells us
that the cleaned data contains residual contamination of similar levels as $f_{\rm NL} > 100$.
Unless this is removed, the constraints put on primordial non-Gaussianity parameters,
such as in [22], are not reliable. Our calculations may be refined to obtain a good
estimate of the residual contamination fraction encoded in the parameter $\epsilon$. It can then
be further subtracted from the cleaned data and the resulting maps can be used to search
for primordial non-Gaussianity. It is also imperative that we redo such analysis after
applying the extended point sources masks PS2 and PS3. We are currently initiating
this investigation.